@spypsy spypsy commented Dec 17, 2025

Fixes A-354 (https://linear.app/aztec-labs/issue/A-354/integrate-slash-protection-in-signing-path)

Integrates the existing validator HA signer into validator-client via a new HAKeyStore wrapper, adds type-safe duty identifiers using discriminated unions, and includes integration testing with multiple validator nodes.

What's New

1. HAKeyStore Implementation

New File: yarn-project/validator-client/src/key_store/ha_key_store.ts

Transparent wrapper around ExtendedValidatorKeyStore that adds HA coordination:

  • Drop-in replacement for any ExtendedValidatorKeyStore implementation
  • AUTH_REQUEST duties bypass HA coordination (safe to sign multiple times)
  • Signs with all addresses concurrently using Promise.allSettled(), filtering out already-signed duties

// With HA protection
const sigs = await haKeyStore.signTypedData(typedData, {
  slot: 100n,
  blockNumber: 50n,
  blockIndexWithinCheckpoint: 0,
  dutyType: DutyType.BLOCK_PROPOSAL,
});

// Without HA protection (AUTH_REQUEST)
const sigs = await haKeyStore.signTypedData(typedData, {
  dutyType: DutyType.AUTH_REQUEST,
});
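
The concurrent signing path described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in ha_key_store.ts: `DutyAlreadySignedError`, `Signer`, and `signWithAll` are hypothetical names, addresses and signatures are plain strings, and a loop stands in for the filter callback used in the PR.

```typescript
// Hypothetical error raised when another HA node already signed this duty.
class DutyAlreadySignedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "DutyAlreadySignedError";
  }
}

// Hypothetical signer: resolves to a signature for one validator address.
type Signer = (address: string) => Promise<string>;

// Sign with every address concurrently. Addresses whose duty was already
// signed elsewhere are dropped; any other failure is re-thrown unchanged.
async function signWithAll(addresses: string[], sign: Signer): Promise<string[]> {
  const results = await Promise.allSettled(addresses.map((address) => sign(address)));
  const signatures: string[] = [];
  for (const result of results) {
    if (result.status === "fulfilled") {
      signatures.push(result.value);
      continue;
    }
    const reason: unknown = result.reason;
    const alreadySigned =
      reason instanceof Error && reason.name === "DutyAlreadySignedError";
    if (!alreadySigned) {
      throw reason; // unexpected error: surface it to the caller
    }
  }
  return signatures;
}
```

Because `Promise.allSettled()` never rejects, one address losing the race does not prevent the remaining addresses from producing signatures.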

2. Type-Safe Duty Identifiers

Modified: yarn-project/validator-ha-signer/src/db/types.ts

Replaced optional blockIndexWithinCheckpoint?: number with discriminated unions:

type DutyIdentifier =
  | BlockProposalDutyIdentifier // MUST have blockIndexWithinCheckpoint >= 0
  | OtherDutyIdentifier; // Does NOT have this field

interface BlockProposalDutyIdentifier {
  validatorAddress: EthAddress;
  slot: SlotNumber;
  blockIndexWithinCheckpoint: number; // Required
  dutyType: DutyType.BLOCK_PROPOSAL;
}

Enforces correctness at compile time and prevents invalid database primary keys.
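
To illustrate how the union narrows, here is a small self-contained sketch. It makes several assumptions: `EthAddress` and `SlotNumber` are replaced by `string` and `number` stand-ins, the enum lists only a subset of duty types with assumed string values, the shape of `OtherDutyIdentifier` is a guess based on the comment above, and `blockIndexOf` is a hypothetical helper.

```typescript
// Subset of duty types; the string values are assumptions.
enum DutyType {
  BLOCK_PROPOSAL = "BLOCK_PROPOSAL",
  CHECKPOINT_PROPOSAL = "CHECKPOINT_PROPOSAL",
  AUTH_REQUEST = "AUTH_REQUEST",
}

interface BlockProposalDutyIdentifier {
  validatorAddress: string; // stand-in for EthAddress
  slot: number; // stand-in for SlotNumber
  blockIndexWithinCheckpoint: number; // required for block proposals
  dutyType: DutyType.BLOCK_PROPOSAL;
}

// Assumed shape: same fields, but no blockIndexWithinCheckpoint.
interface OtherDutyIdentifier {
  validatorAddress: string;
  slot: number;
  dutyType: Exclude<DutyType, DutyType.BLOCK_PROPOSAL>;
}

type DutyIdentifier = BlockProposalDutyIdentifier | OtherDutyIdentifier;

// Checking the dutyType discriminant lets the compiler prove that
// blockIndexWithinCheckpoint exists before it is read.
function blockIndexOf(id: DutyIdentifier): number | undefined {
  if (id.dutyType === DutyType.BLOCK_PROPOSAL) {
    return id.blockIndexWithinCheckpoint;
  }
  return undefined;
}
```

With these types, building a `BLOCK_PROPOSAL` identifier without `blockIndexWithinCheckpoint`, or reading that field before checking `dutyType`, is rejected by the compiler.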

3. Extended Duty Type Support

Added: CHECKPOINT_PROPOSAL, GOVERNANCE_VOTE, SLASHING_VOTE, AUTH_REQUEST

4. Integration Test

New File: yarn-project/validator-client/src/validator.ha.integration.test.ts

E2E test with 5 validators sharing a PostgreSQL database. Tests concurrent block proposals, attestations, checkpoint proposals, and slashing protection.

Key Changes

Cleanup Timeout Configuration

Made maxStuckDutiesAgeMs dynamically computed from actual slot duration:

const haConfig = {
  ...config,
  maxStuckDutiesAgeMs:
    config.maxStuckDutiesAgeMs ??
    epochCache.getL1Constants().slotDuration * 2 * 1000,
};
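
As a worked example of the default, the sketch below isolates the fallback in a hypothetical helper, `resolveMaxStuckDutiesAgeMs`. The 72-second slot duration is an assumed value, chosen only because it reproduces the 144000 shown in the Configuration section.

```typescript
// Hypothetical helper mirroring the fallback above: use the configured value
// when present, otherwise twice the slot duration, converted from s to ms.
function resolveMaxStuckDutiesAgeMs(
  configured: number | undefined,
  slotDurationSeconds: number,
): number {
  return configured ?? slotDurationSeconds * 2 * 1000;
}

// Assumed slot duration of 72 s: 72 * 2 * 1000 = 144000 ms.
const defaultAgeMs = resolveMaxStuckDutiesAgeMs(undefined, 72);

// An explicit setting takes precedence over the computed default.
const overriddenAgeMs = resolveMaxStuckDutiesAgeMs(60000, 72);
```

Using `??` rather than `||` means only a missing value triggers the default; an explicit numeric setting is always honoured.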

SigningContext Now Required

All ValidatorKeyStore signing methods now require a SigningContext parameter. Updated LocalKeyStore, NodeKeystoreAdapter, Web3SignerKeyStore, and the new HAKeyStore.

Database Schema

Added block_index_within_checkpoint to the primary key to support multiple block proposals per slot:

PRIMARY KEY (validator_address, slot, duty_type, block_index_within_checkpoint)

Configuration

VALIDATOR_HA_DATABASE_URL=postgresql://user:pass@host:port/db
VALIDATOR_HA_SIGNING_ENABLED=true
VALIDATOR_HA_NODE_ID=validator-node-1
VALIDATOR_HA_POLLING_INTERVAL_MS=100        # Optional
VALIDATOR_HA_SIGNING_TIMEOUT_MS=3000        # Optional
VALIDATOR_HA_MAX_STUCK_DUTIES_AGE_MS=144000 # Optional (defaults to 2× slot duration)

Notes

  • HA is opt-in via VALIDATOR_HA_SIGNING_ENABLED=true
  • Non-HA validators work unchanged
  • Lock-free read operations
  • Atomic lock acquisition using INSERT ... ON CONFLICT with retry logic (10ms, 20ms, 30ms backoff) to handle PostgreSQL transaction visibility edge cases in high-concurrency scenarios
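
The retry schedule in the last note can be sketched as below. This is only an illustration of the 10ms, 20ms, 30ms backoff: `withLinearBackoff` and its `attempt` callback are hypothetical, and treating `undefined` as "outcome not yet visible" is an assumption about how the real lock query reports the edge case.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `attempt` once, then retry after each delay for as long as it returns
// undefined. Gives up and returns undefined once the delays are exhausted.
async function withLinearBackoff<T>(
  attempt: () => Promise<T | undefined>,
  delaysMs: number[] = [10, 20, 30],
): Promise<T | undefined> {
  let result = await attempt();
  for (const delayMs of delaysMs) {
    if (result !== undefined) {
      break;
    }
    await sleep(delayMs);
    result = await attempt();
  }
  return result;
}
```

With the default delays the callback runs at most four times, adding at most 60ms of waiting before the caller has to handle the unresolved case.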

Base automatically changed from spy/ha to next January 8, 2026 10:35
return false;
}
// Re-throw unexpected errors
throw result.reason;
Contributor

Throwing from inside filter feels weird. I wonder what happens to the stack trace. It might be better to throw new Error('Unexpected error', { cause: result.reason }); that would also make sure we're always throwing actual Error objects.

Later edit: the stack trace is maintained and points to the source of the error rather than the filter callback.

) {
this.log = createLogger('validator-ha-signer');

if (!config.enabled) {
if (!config.haSigningEnabled) {
Contributor

I think this should be controlled at a higher level (i.e. in the validator or the start_node.ts script), not in the constructor.

Member Author

it is basically controlled at the validator factory (.new) but makes sense to have it here as a guard too?

@spypsy spypsy force-pushed the spy/ha-keystore branch 2 times, most recently from 46432ea to b9212d1 Compare January 14, 2026 17:33
@AztecBot AztecBot enabled auto-merge January 16, 2026 17:05
@AztecBot AztecBot added this pull request to the merge queue Jan 16, 2026
@spypsy spypsy removed this pull request from the merge queue due to a manual request Jan 16, 2026
@spypsy spypsy added this pull request to the merge queue Jan 16, 2026
@spypsy spypsy removed this pull request from the merge queue due to a manual request Jan 16, 2026
@AztecBot AztecBot enabled auto-merge January 16, 2026 18:50
@AztecBot AztecBot added this pull request to the merge queue Jan 16, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 16, 2026
@spypsy spypsy added this pull request to the merge queue Jan 16, 2026
@AztecBot
Collaborator

AztecBot commented Jan 16, 2026

Flakey Tests

🤖 says: This CI run detected 2 tests that failed, but were tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/0e3454790746ad9e): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_invalidate_block.parallel.test.ts "committee member invalidates a block if proposer does not come through" (100s) (code: 1) group:e2e-p2p-epoch-flakes (spypsy: feat: ValidatorKeystore implementation for high-availability signer (#19094))
FLAKED (http://ci.aztec-labs.com/27baf83195de5df3): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_invalidate_block.parallel.test.ts "proposer invalidates multiple blocks" (132s) (code: 1) group:e2e-p2p-epoch-flakes (spypsy: feat: ValidatorKeystore implementation for high-availability signer (#19094))

Merged via the queue into next with commit 0312309 Jan 16, 2026
18 checks passed
@spypsy spypsy deleted the spy/ha-keystore branch January 16, 2026 20:49